Reliable Decision from Multiple Subtasks through Threshold Optimization: Content Moderation in the Wild
Social media platforms struggle to protect users from harmful content through
content moderation. These platforms have recently leveraged machine learning
models to cope with the vast amount of user-generated content posted daily. Since
moderation policies vary depending on countries and types of products, it is
common to train and deploy the models per policy. However, this approach is
highly inefficient, especially when the policies change, requiring dataset
re-labeling and model re-training on the shifted data distribution. To
alleviate this cost inefficiency, social media platforms often employ
third-party content moderation services that provide prediction scores of
multiple subtasks, such as predicting the existence of underage individuals, rude
gestures, or weapons, instead of directly providing final moderation decisions.
However, making a reliable automated moderation decision from the prediction
scores of the multiple subtasks for a specific target policy has not been
widely explored yet. In this study, we formulate real-world scenarios of
content moderation and introduce a simple yet effective threshold optimization
method that searches for the optimal thresholds of the multiple subtasks to make a
reliable moderation decision in a cost-effective way. Extensive experiments
demonstrate that our approach shows better performance in content moderation
compared to existing threshold optimization methods and heuristics.
Comment: WSDM 2023 (Oral Presentation)
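
To make the idea of per-subtask thresholds concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes a policy that flags an item when any subtask score reaches its threshold, and it swaps in a plain exhaustive grid search that maximizes recall subject to a minimum precision. The function names, the toy scores, the grid, and the precision target are all hypothetical and chosen only for illustration.

```python
from itertools import product


def decide(scores, thresholds):
    """Flag an item if any subtask score reaches its threshold (assumed rule)."""
    return any(s >= t for s, t in zip(scores, thresholds))


def precision_recall(items, labels, thresholds):
    """Precision and recall of the flagging rule on labeled items."""
    preds = [decide(x, thresholds) for x in items]
    tp = sum(p and y for p, y in zip(preds, labels))
    fp = sum(p and not y for p, y in zip(preds, labels))
    fn = sum((not p) and y for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def grid_search_thresholds(items, labels, grid, min_precision):
    """Exhaustive search: highest recall among settings meeting min_precision."""
    best, best_recall = None, -1.0
    for thresholds in product(grid, repeat=len(items[0])):
        precision, recall = precision_recall(items, labels, thresholds)
        if precision >= min_precision and recall > best_recall:
            best, best_recall = thresholds, recall
    return best, best_recall


# Toy scores for three subtasks (underage, rude gesture, weapon); made-up data.
items = [
    (0.9, 0.1, 0.2),
    (0.2, 0.8, 0.1),
    (0.1, 0.2, 0.7),
    (0.3, 0.4, 0.1),
    (0.6, 0.1, 0.3),
    (0.1, 0.1, 0.1),
]
# True means the item violates the (hypothetical) target policy.
labels = [True, True, True, False, False, False]

best, best_recall = grid_search_thresholds(
    items, labels, grid=(0.3, 0.5, 0.7, 0.9), min_precision=1.0
)
print(best, best_recall)
```

The exhaustive search grows exponentially with the number of subtasks, so it is only practical for a handful of them; it is shown here because it makes the objective and the constraint explicit.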